Multi-Agent Email Forensics System
The Multi-Agent Email Forensics System is an intelligent agent-based solution for automated security analysis of email communications. The system employs four specialized autonomous agents working in a coordinated pipeline to discover, analyze, visualize, and report on potential security threats.
The agents cooperate through a sequential pipeline in which each agent's output becomes the input of the next:
┌─────────────────┐     ┌──────────────┐     ┌─────────────────┐     ┌────────────────┐
│    Discovery    │────▶│   Analysis   │────▶│    Dashboard    │────▶│     Report     │
│      Agent      │     │    Agent     │     │      Agent      │     │     Agent      │
└─────────────────┘     └──────────────┘     └─────────────────┘     └────────────────┘
         │                     │                      │                      │
         ▼                     ▼                      ▼                      ▼
    Email Files          Findings List         Visualizations             Reports
Discovery Agent
Role: Autonomous data acquisition
Locates email files and parses them into structured objects
Pattern: Repository
Analysis Agent
Role: Threat detection
Applies 4 parallel detection strategies
Pattern: Strategy
Dashboard Agent
Role: Visual analytics
Creates 8 visualization types
Pattern: Factory
Report Agent
Role: Report generation
Produces text and HTML reports
Pattern: Template Method
EnhancedEmailGenerator creates 50 test emails (30% suspicious, 70% normal) with realistic subjects, timestamps, and sender domains.
Output: Email files in output/emails/
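The generation step above can be sketched roughly as follows. This is a minimal illustration, not the actual EnhancedEmailGenerator: the function name `generate_test_emails`, the subject and domain pools, the file naming, and the header-style file layout are all assumptions made for the example.

```python
import random
from datetime import datetime, timedelta
from pathlib import Path

# Illustrative pools only; the real generator's subjects and domains are not shown here.
SUSPICIOUS_SUBJECTS = ["URGENT: verify your password", "Confidential wire transfer"]
NORMAL_SUBJECTS = ["Weekly status update", "Lunch on Friday?"]
EXTERNAL_DOMAINS = ["mail.example.net", "example.org"]
INTERNAL_DOMAIN = "company.example"  # hypothetical internal domain


def generate_test_emails(out_dir, total=50, suspicious_ratio=0.3, seed=0):
    """Write `total` plain-text emails and return (path, is_suspicious) pairs."""
    rng = random.Random(seed)  # seeded so test data is reproducible
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Fix the suspicious share exactly, then shuffle the order.
    n_suspicious = round(total * suspicious_ratio)
    labels = [True] * n_suspicious + [False] * (total - n_suspicious)
    rng.shuffle(labels)

    start = datetime(2024, 1, 1)
    written = []
    for i, suspicious in enumerate(labels, start=1):
        subject = rng.choice(SUSPICIOUS_SUBJECTS if suspicious else NORMAL_SUBJECTS)
        domain = rng.choice(EXTERNAL_DOMAINS) if suspicious else INTERNAL_DOMAIN
        # Random timestamp within a 30-day window, at any hour of the day.
        sent = start + timedelta(minutes=rng.randrange(30 * 24 * 60))
        text = (
            f"ID: {i}\n"
            f"Subject: {subject}\n"
            f"From: user{i}@{domain}\n"
            f"To: analyst@{INTERNAL_DOMAIN}\n"
            f"Date: {sent:%Y-%m-%d %H:%M}\n"
            "\n"
            "Generated test message.\n"
        )
        path = out / f"email_{i:03d}.txt"
        path.write_text(text, encoding="utf-8")
        written.append((path, suspicious))
    return written
```

Fixing the count of suspicious emails up front, rather than flipping a weighted coin per email, keeps the 30/70 split exact on every run.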
DiscoveryAgent autonomously locates and loads emails, parsing structure (ID, Subject, From, To, Date, Content) into SimpleEmail objects.
Output: List of SimpleEmail objects
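As a rough sketch of the discovery step, the code below scans a directory and parses each file into a SimpleEmail. The on-disk layout (one `Name: value` header per line, a blank line, then the body), the `.txt` extension, and the helper names `parse_email_text` and `discover_emails` are assumptions; the project's real parser may differ.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class SimpleEmail:
    id: str
    subject: str
    sender: str
    recipient: str
    date: str  # kept as raw text in this sketch
    content: str
    file_path: str


# Assumed mapping from header names in the file to SimpleEmail fields.
HEADER_TO_FIELD = {
    "ID": "id",
    "Subject": "subject",
    "From": "sender",
    "To": "recipient",
    "Date": "date",
}


def parse_email_text(text, file_path=""):
    """Parse header lines up to the first blank line; the rest is the body."""
    head, _, body = text.partition("\n\n")
    fields = {name: "" for name in HEADER_TO_FIELD.values()}
    for line in head.splitlines():
        # Split on the first colon only, so times such as 22:30 stay intact.
        key, sep, value = line.partition(":")
        field_name = HEADER_TO_FIELD.get(key.strip())
        if sep and field_name:
            fields[field_name] = value.strip()
    return SimpleEmail(content=body.strip(), file_path=file_path, **fields)


def discover_emails(directory):
    """Load every .txt file in `directory`, in sorted order."""
    return [
        parse_email_text(path.read_text(encoding="utf-8"), str(path))
        for path in sorted(Path(directory).glob("*.txt"))
    ]
```

Calling `discover_emails("output/emails")` would then return the list that the next agent consumes, provided the files follow the assumed layout.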
AnalysisAgent applies four parallel detection strategies.
Output: List of Finding objects with severity classification (High/Medium/Low)
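The Strategy pattern behind this step might look like the sketch below. Only two of the four strategies are shown (keyword and after-hours), and the keyword list, the severity rules, the `Email` stand-in class, and the function names are assumptions for illustration. The strategies are independent of one another, but this sketch simply runs them one after another.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Email:
    """Stand-in for SimpleEmail with only the fields this sketch needs."""
    id: str
    subject: str
    date: datetime
    content: str


@dataclass
class Finding:
    finding_type: str
    description: str
    email_id: str
    severity: str  # "High", "Medium", or "Low"
    timestamp: datetime = field(default_factory=datetime.now)


KEYWORDS = ("urgent", "password", "wire transfer")  # illustrative subset


def keyword_strategy(email):
    """Flag emails containing suspicious keywords (severity rule is assumed)."""
    text = f"{email.subject} {email.content}".lower()
    hits = [k for k in KEYWORDS if k in text]
    if not hits:
        return None
    severity = "High" if len(hits) >= 2 else "Medium"
    return Finding("keyword", f"Keywords found: {', '.join(hits)}", email.id, severity)


def after_hours_strategy(email):
    """Flag emails sent outside 8 AM - 6 PM."""
    if 8 <= email.date.hour < 18:
        return None
    return Finding("after_hours", f"Sent at {email.date:%H:%M}", email.id, "Low")


STRATEGIES = [keyword_strategy, after_hours_strategy]


def analyze(emails, strategies=STRATEGIES):
    """Apply every strategy to every email and collect the findings."""
    findings = []
    for email in emails:
        for strategy in strategies:
            finding = strategy(email)
            if finding is not None:
                findings.append(finding)
    return findings
```

Because each strategy is just a callable that takes an email and returns a Finding or None, adding a further detector means appending one function to `STRATEGIES`.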
DashboardAgent generates 8 visualizations using matplotlib, seaborn, and wordcloud.
Output: PNG files in output/visualizations/
ReportAgent consolidates results using Jinja2 templates into text and HTML formats with embedded visualizations.
Output: forensics_report.html and forensics_report.txt in output/reports/
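The Template Method pattern named for the Report Agent can be illustrated as follows. To keep the example dependency-free, it uses plain string formatting in place of the Jinja2 templates the project actually uses, and the class names and the dictionary shape of a finding are hypothetical.

```python
from abc import ABC, abstractmethod
from html import escape


class ReportBuilder(ABC):
    def build(self, title, findings):
        """Template method: the step order is fixed, the formatting is not."""
        parts = [self.header(title)]
        parts.extend(self.finding_row(f) for f in findings)
        parts.append(self.footer(len(findings)))
        return "\n".join(parts)

    @abstractmethod
    def header(self, title): ...

    @abstractmethod
    def finding_row(self, finding): ...

    @abstractmethod
    def footer(self, count): ...


class TextReport(ReportBuilder):
    def header(self, title):
        return f"{title}\n{'=' * len(title)}"

    def finding_row(self, finding):
        return f"[{finding['severity']}] {finding['description']}"

    def footer(self, count):
        return f"Total findings: {count}"


class HtmlReport(ReportBuilder):
    def header(self, title):
        return f"<html><body><h1>{escape(title)}</h1><ul>"

    def finding_row(self, finding):
        # Escape user-controlled text so email content cannot inject markup.
        return (
            f"<li><b>{escape(finding['severity'])}</b>: "
            f"{escape(finding['description'])}</li>"
        )

    def footer(self, count):
        return f"</ul><p>Total findings: {count}</p></body></html>"
```

Both formats share `build()`, so the text and HTML reports always list the same findings in the same order.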
The system also auto-generates PlantUML class and sequence diagrams that document its architecture.
Output: Documentation in output/uml_documentation/
SimpleEmail: Email dataclass with fields id, subject, sender, recipient, date, content, file_path
Methods:
is_suspicious() - Checks for 21 suspicious keywords
is_after_hours() - Detects emails sent outside business hours (8 AM - 6 PM)
is_external() - Checks if the sender is from an external domain
Finding: Investigation finding with fields finding_type, description, email_id, severity, timestamp
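The three SimpleEmail helper methods described above could be implemented along these lines. The keyword tuple is an illustrative subset (the 21 actual keywords are not listed here), `INTERNAL_DOMAIN` is a hypothetical value, storing `date` as a `datetime` is an assumption, and treating exactly 6 PM as after hours is a choice made for this sketch.

```python
from dataclasses import dataclass
from datetime import datetime

SUSPICIOUS_KEYWORDS = ("password", "urgent", "confidential")  # illustrative subset
INTERNAL_DOMAIN = "company.com"  # hypothetical internal domain


@dataclass
class SimpleEmail:
    id: str
    subject: str
    sender: str
    recipient: str
    date: datetime
    content: str
    file_path: str = ""

    def is_suspicious(self):
        """True if any keyword appears in the subject or body (case-insensitive)."""
        text = f"{self.subject} {self.content}".lower()
        return any(keyword in text for keyword in SUSPICIOUS_KEYWORDS)

    def is_after_hours(self):
        """True if sent before 8 AM or at/after 6 PM."""
        return not (8 <= self.date.hour < 18)

    def is_external(self):
        """True if the sender's domain differs from the internal domain."""
        domain = self.sender.rpartition("@")[2].lower()
        return domain != INTERNAL_DOMAIN
```

Keeping these checks on the dataclass lets every detection strategy reuse them without re-parsing the email.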
The test suite contains 29 tests, all passing.
Building this multi-agent system from scratch revealed the practical value of autonomous, specialized agents working in coordinated pipelines. The separation of concerns between discovery, analysis, visualization, and reporting made the system easier to develop, test, and maintain.
Implementing four detection strategies taught me that effective security analysis requires multiple perspectives. Keyword detection alone misses after-hours anomalies, while temporal analysis misses sophisticated phishing sent during normal business hours. The comprehensive test suite (29 tests, all passing) caught integration bugs early, particularly in the data flow between agents.
This project connected theoretical concepts about agent architectures (Wooldridge, 2009) with practical concerns like error handling and user experience. Generating professional visualizations taught me that intelligent systems must present results in accessible formats for non-technical stakeholders.